Abstract

Background: Cellular functions depend on genetic, physical and other types of interactions. As such, derived 
interaction networks can be utilized to discover novel genes involved in specific biological processes. Epistatic 
Miniarray Profile, or E-MAP, which is an experimental platform that measures genetic interactions on a genome- 
wide scale, has successfully recovered known pathways and revealed novel protein complexes in Saccharomyces 
cerevisiae (budding yeast). 

Results: By combining E-MAP data with co-expression data, we first predicted a potential cell cycle related gene 
set. Using Gene Ontology (GO) function annotation as a benchmark, we demonstrated that the prediction by 
combining microarray and E-MAP data is generally >50% more accurate in identifying co-functional gene pairs 
than the prediction using either data source alone. We also used transcription factor (TF)-DNA binding data (Chip-chip)
and protein phosphorylation data to construct a local cell cycle regulation network based on the potential cell
cycle-related gene set we predicted. Finally, based on the E-MAP screening of 48 cell cycle genes crossed against 1536
library strains, we predicted four unknown genes (YPL158C, YPR174C, YJR054W, and YPR045C) as potential cell cycle
genes, and analyzed them in detail.

Conclusion: By integrating E-MAP and DNA microarray data, potential cell cycle-related genes were detected in 
budding yeast. This integrative method significantly improves the reliability of identifying co-functional gene pairs. 
In addition, the reconstructed network sheds light on both the function of known and predicted genes in the cell 
cycle process. Finally, our strategy can be applied to other biological processes and species, given the availability of 
relevant data. 



Background 

According to [1], "mutations in two genes produce a 
phenotype that is surprising in light of each mutation's 
individual effects. This phenomenon, which defines 
genetic interaction, can reveal functional relationships 
between genes and pathways." Thus, deciphering genetic 
interaction networks via high-throughput technologies 
can both reveal the schematic wiring of biological pro- 
cesses and predict novel genes. Recently, several such 
high-throughput technologies have been developed to 
identify genetic interactions at the genome scale, includ- 
ing Synthetic Genetic Array (SGA) [2], Diploid-based 
Synthetic Lethality Analysis on Microarrays (dSLAM)
[3], and Epistatic Miniarray Profile (E-MAP) [4]. The 
first two approaches aim to identify synthetic lethal, or 
negative, interactions, meaning that the double mutant 
is more lethal than the corresponding single mutants. 
On the other hand, assuming that the expected phenotype
of a double mutation reflects the combined effects
of the single mutations, E-MAP, an extension of SGA,
gains power by identifying positive as well as negative
interactions; a positive interaction indicates that the
double mutant is healthier than expected.

Here, we exploited the E-MAP methodology to dis- 
cover novel genes involved in the cell cycle process in 
budding yeast. The distinct advantage of using E-MAP 
is the potential of discovering functionally associated 
genes which do not otherwise physically interact. Physical
interaction assays, such as the yeast two-hybrid system
or DNA-binding microarrays, are unlikely to reveal
these associations. Despite these advantages of E-MAP,
interpretation of the data is still challenging. First,
genetic interactions occur both between and within
functional modules. Thus, the function of a gene cannot
be determined from its interacting partners alone. Second,
E-MAP suffers from high false positive and false negative
rates, making it difficult to infer genetic interactions
accurately and comprehensively. Consequently, the integration
of external information, such as gene expression, Transcription
Factor (TF)-DNA binding (chip-chip) and protein
phosphorylation, is necessary in order to identify novel genes
involved in the cell cycle process.

Several methods have been developed to integrate
multiple types of data, including mRNA expression,
chip-chip, physical interaction and protein phosphorylation,
to infer transcriptional regulatory networks in eQTL
analysis [5-8]. In this paper, we integrated genetic
interaction and other genomic data to construct a specific
network for the cell cycle process in budding yeast.

Results 

Construction of a potential cell cycle-related gene set

As indicated in Figure 1, our strategy, which integrates
multiple types of data, aims to identify potential cell
cycle genes starting from the known cell cycle gene set. Since
both genetically interacting and co-expressed gene pairs
tend to be co-functional, we hypothesized that a potential
cell cycle-related gene set with higher confidence
can be obtained by combining the two data
sources than by using either data source alone.

To accomplish this, we first quantitatively measured
whether genetic interactions and co-expression indicate
co-functional membership and, if so, to what degree.
The E-MAP method was adopted for genetic interaction
analysis. Forty-eight known cell-cycle genes (KCCGs,
Table S1 in Additional file 1) were screened against a
library of 1536 test strains in budding yeast, yielding a
quantitative value (S-score, defined in [3]) for 67680
gene pairs (91% of all possible pair-wise measurements).
Co-expression values were then calculated from 8 groups
of time-course expression datasets generated in four
previously published studies (see Methods and materials
for details). To calculate the enrichment of co-functional
gene pairs over random gene pairs, we first computed the
fraction f of gene pairs that fall inside one biological
process term in GO, either among the pairs in each bin of
S-score or of cc, the correlation coefficient of expression
(Figure 2A and 2B), or among the pairs that are
simultaneously more extreme than given cut-offs s and cc
(Figure 2C and 2D). The enrichment is then the
ratio f/r, where r is the fraction of random gene pairs
which participate in the same GO biological processes.
As expected, Figure 2A confirms that gene pairs having
a higher cc are more likely to be co-functional. Also,
Figure 2B shows that gene pairs with either significantly
positive or significantly negative S-scores are more likely
to be co-functional. Comparing the enrichment in Figures
2A and 2B shows that extreme S-scores indicate
co-functional membership more efficiently than
co-expression does.
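The enrichment ratio f/r described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `share_go_term` and `enrichment_over_random`, the toy gene identifiers, and the annotation dictionary are hypothetical, and the random background is approximated by enumerating all pairs of background genes rather than by sampling.

```python
from itertools import combinations

def share_go_term(pair, go_terms):
    """True if the two genes share at least one GO biological process term."""
    a, b = pair
    return bool(go_terms.get(a, set()) & go_terms.get(b, set()))

def enrichment_over_random(selected_pairs, background_genes, go_terms):
    """Return f / r: f is the co-functional fraction among the selected pairs,
    r is the co-functional fraction among all pairs of background genes."""
    if not selected_pairs:
        raise ValueError("no selected pairs")
    f = sum(share_go_term(p, go_terms) for p in selected_pairs) / len(selected_pairs)
    background_pairs = list(combinations(sorted(background_genes), 2))
    r = sum(share_go_term(p, go_terms) for p in background_pairs) / len(background_pairs)
    if r == 0:
        raise ValueError("no co-functional pairs in the background")
    return f / r

# Toy annotation (hypothetical): gene -> set of GO biological process terms
go_terms = {
    "g1": {"cell cycle"},
    "g2": {"cell cycle"},
    "g3": {"transport"},
    "g4": {"transport"},
}
# Pairs falling in one S-score (or cc) bin, again hypothetical
selected = [("g1", "g2")]
ratio = enrichment_over_random(selected, go_terms.keys(), go_terms)
print(ratio)
```

In practice the selected pairs would be those whose S-score or cc falls in a given bin, or those that pass both cut-offs simultaneously, so that the ratio can be plotted per bin.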

When combining these two kinds of data, we found
that they were complementary. As shown in Figure 2C,
for a certain cut-off of S-scores, gene pairs with a higher
correlation of expression are more likely to be co-functional,
and vice versa. Therefore, the results supported
our original hypothesis that combining these two kinds
of information could help to construct a more accurate
potential cell cycle-related gene set. Based on the data in
Figure 2C, we defined areas of S-score and cc values in
which gene pairs are considered significantly interacting.
For a positive genetic interaction area, we require that the
enrichment over random be larger than 2, and for a negative
genetic interaction area, we require that it be larger than 4.
The resulting constraints are (S-score>2.5 and cc>0.9) or
S-score>6 or (S-score<-3 and cc>0.9) or (S-score<-14 and
cc>0.85). Compared to the most powerful single method at
each point, the combination is generally >50% more
accurate in the areas defined above (Figure 2D). Finally,
259 gene pairs between 206 of the 1536 test strains and the 48
KCCGs passed the filter. We use these 206 test genes as
the potential set of cell cycle-related genes (PCCGs).
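The combined filter can be written as a small predicate. The sketch below is illustrative only: the function name `is_significant` and the example records are hypothetical, and the co-expression thresholds in the negative area are read as cc > 0.9 and cc > 0.85, which is an assumption about the cut-offs quoted in the text.

```python
def is_significant(s_score, cc):
    """Apply the combined S-score / co-expression cut-offs quoted in the text.
    The negative-area cc thresholds (0.9 and 0.85) are an assumed reading."""
    positive = (s_score > 2.5 and cc > 0.9) or s_score > 6
    negative = (s_score < -3 and cc > 0.9) or (s_score < -14 and cc > 0.85)
    return positive or negative

# Hypothetical records: (known cell cycle gene, test gene, S-score, cc)
pairs = [
    ("CLN3", "geneA", 3.0, 0.95),    # positive area, both criteria met
    ("CLN3", "geneB", 3.0, 0.50),    # S-score too weak without co-expression
    ("BUD4", "geneC", 7.2, 0.10),    # S-score alone is extreme enough
    ("BUD4", "geneD", -15.0, 0.88),  # strongly negative with cc > 0.85
    ("SWI4", "geneE", -4.0, 0.80),   # negative, but co-expression too low
]
passed = [p for p in pairs if is_significant(p[2], p[3])]
candidate_genes = sorted({test_gene for _, test_gene, _, _ in passed})
print(candidate_genes)
```

The set of distinct test genes among the passing pairs plays the role of the PCCGs in this toy setting.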

Recovery of known genetic interactions with our E-MAP 

We compared our E-MAP data with the benchmark 
data. Similar to previous work [9], we tested the sensi- 
tivity and precision of the E-MAP data (see Methods 
and materials). Compared to genetic interactions in 
BIOGRID, both the positive and negative interactions 
are very precise (p-value < 10"^°). 
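As a rough illustration of this benchmark comparison, sensitivity and precision can be computed from sets of unordered gene pairs. The sketch assumes that predicted pairs are those passing an S-score cut-off and that known pairs come from a curated benchmark; the function name `sensitivity_precision` and all gene names are hypothetical.

```python
def sensitivity_precision(predicted, known):
    """Return (sensitivity, precision) for two collections of unordered gene pairs."""
    predicted = {frozenset(p) for p in predicted}
    known = {frozenset(p) for p in known}
    tp = len(predicted & known)   # known interactions that were predicted
    fn = len(known - predicted)   # known interactions that were missed
    fp = len(predicted - known)   # predictions absent from the benchmark
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision

# Hypothetical benchmark and predictions
known = [("A", "B"), ("C", "D"), ("E", "F"), ("G", "H")]
predicted = [("B", "A"), ("C", "D"), ("X", "Y")]
sens, prec = sensitivity_precision(predicted, known)
print(sens, prec)
```

Using `frozenset` makes the comparison independent of the order in which the two genes of a pair are listed.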

We also tested the efficiency when combining E-MAP 
with DNA microarray data. When the co-expression test 
was applied, the significance level of precision increases 
around 2-fold (Table S2 in Additional file 2). This result 
indicates that co-expression does indeed provide extra 
information about genetic interaction. Hence, our strat- 
egy can be used to identify potential cell cycle genes 
and their relationships with known cell cycle genes, thus 
enabling us to construct a reliable network. 

We also compared our S-scores with previously published
large-scale genetic interaction data [8]. Significantly
interacting gene pairs show an obvious correlation
between the two datasets (r=0.64, Figure S1 in Additional
file 3).
Figure 1 Overview of our strategy to integrate multiple types of data to identify cell cycle genes and infer regulatory pathways.



Functional enrichment analysis in the PCCGs 

Functional enrichment analysis was performed on all
GO biological process terms in both the positive and negative
parts of the PCCGs. We defined the positive part as
those genes having (S-score>2.5 and cc>0.9) or
S-score>6 and the negative part as those genes having
(S-score<-3 and cc>0.9) or (S-score<-14 and cc>0.85).
We distinguished these two parts because the principles
on which positive and negative genetic interactions are
based may be different for functional analysis, as
discussed in a previous study [10]. For the positive part,
only one functional category, "nucleosome organization,"
was enriched at the 98% confidence level (q=0.012). For
the negative part, five functional categories were
enriched, including "DNA-dependent DNA replication,"
"chromatin assembly," "interphase of mitotic cell cycle,"
"cell cycle checkpoint" and "regulation of organelle
organization" (q=0.014, 0.013, 0.015, 0.009 and 0.012,
respectively). All these biological processes can be
interpreted either as related to the cell cycle process or
as part of it. In addition,
KCCGs were found to be mainly involved in "regulation
of organelle organization," "regulation of mitotic cell
cycle," "interphase of mitotic cell cycle," "regulation of
cell cycle process" and "cell cycle checkpoint" processes.
Hence, the negative part contains more directly
co-functional genes than the positive part. This is
consistent with the surprising conclusion of previous
work [9] indicating that negative, in contrast to positive,
genetic interactions always occur between genes with
overlapping functions.

Figure 2 Combining genetic interaction and co-expression data to define co-functional gene pairs. High correlation coefficient (A) and
extreme S-score (B) both correspond with known co-functional gene pairs in Gene Ontology (GO) (bin size is 1 Correlation and 1.5 for S-Score). (C)
Gene pairs with more extreme S-scores and larger correlation of expression have higher enrichment. (D) When combining the two criteria, the
enrichment is significantly improved in the extreme S-score and large correlation area.

Finally, we also analyzed the functional enrichment of
all PCCGs. Three functions, "chromatin
assembly," "regulation of organelle organization" and
"nucleosome organization," were enriched (q=0.015,
0.009 and 0.002, respectively). This suggests the importance
of separating PCCGs into two parts for a functional
enrichment analysis. Such separation further helps us to understand
how known cell cycle genes interact, both positively and
negatively, in terms of their functions and also helps us to
find specific functions enriched in only one of the two
parts, such as "DNA-dependent DNA replication,"
"interphase of mitotic cell cycle" and "cell cycle
checkpoint," which are enriched only in the negative part but
not in all PCCGs.

Wang et al. BMC Systems Biology 2011, 5(Suppl 1):S9
http://www.biomedcentral.com/1752-0509/5/S1/S9

A cell-cycle transcriptional network based on the PCCGs 
and KCCGs 

In the next step, we searched for the main transcription
factors (TFs) regulating both the PCCGs and the KCCGs
based on TF-DNA binding data (Chip-Chip), and then we
constructed the resulting transcriptional regulatory network.
In previous studies, Chip-Chip data are usually combined
with expression information to construct the
regulatory network. In our method, periodic expression
was considered as a criterion for TF inclusion. Since most genes involved
in the cell cycle process are expressed periodically, it is
reasonable to assume corresponding periodicity of their
transcriptional regulators. In addition, we also assumed
that the regulatory targets of a TF involved in the cell
cycle would be enriched for known cell cycle genes.
Hence, our transcription network was based on TFs
enriched for cell cycle targets, where the targets are drawn
from the combined pool of potential and known cell cycle
genes (PCCGs and KCCGs). The significance of periodicity
and of enrichment for cell cycle genes can be calculated (see Methods
and materials). Both approaches tend to select TFs which
are known to be involved in cell cycle regulation according
to MIPS functional annotation (Figure 3 [11]).

Periodicity and enrichment are consistent criteria since
most of the known cell cycle TFs rank at the top in both
cases. However, some TFs are ranked differently (see
Table S3 in Additional file 4). For example, Mcm1 is
ranked 6/130 in the enrichment test (ET); however, it
ranks 124/183 in the periodic test (PT), which means that
its expression does not show periodicity. Mcm1 is known
to regulate different phases of the cell cycle
[12,13], so its expression is not expected to be periodic. However,
many of its neighboring genes in the transcriptional network
are cell cycle genes, making its identification possible
in the enrichment test. Similar to Mcm1, Skn7 ranks 23
and 114, respectively, in ET and PT. In contrast, HCM1 is
ranked 3 in PT, but 42 in ET. One possible explanation of
this apparent difference is that the PCCGs and KCCGs
only cover a limited part of cell cycle-related genes, and
some targets of HCM1 are missing in this set. Other
examples, such as YHP1, are similar to HCM1. Based on this
analysis, a TF that is significant in either test should be
included. Hence, we use the product of the two
ranks as an index, and we use the rank of this index to set the
priority order (see Methods and materials for details).
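The rank-product index can be illustrated with a short Python sketch. It assumes each TF has one p-value from the enrichment test and one from the periodicity test, and it only ranks TFs present in both tests, which is a simplification; the function names `rank_ascending` and `prioritize` and the p-values are hypothetical.

```python
def rank_ascending(p_values):
    """Map each TF to its 1-based rank by ascending p-value (ties broken by name)."""
    ordered = sorted(p_values, key=lambda name: (p_values[name], name))
    return {name: i + 1 for i, name in enumerate(ordered)}

def prioritize(enrich_p, period_p):
    """Order TFs by the product of their enrichment rank and periodicity rank."""
    shared = enrich_p.keys() & period_p.keys()
    r_e = rank_ascending({k: enrich_p[k] for k in shared})
    r_p = rank_ascending({k: period_p[k] for k in shared})
    return sorted(shared, key=lambda k: (r_e[k] * r_p[k], k))

# Hypothetical p-values for five TFs
enrich_p = {"TF_A": 1e-6, "TF_B": 0.5, "TF_C": 1e-3, "TF_D": 0.2, "TF_E": 0.8}
period_p = {"TF_A": 0.9, "TF_B": 1e-5, "TF_C": 1e-2, "TF_D": 0.6, "TF_E": 0.3}
order = prioritize(enrich_p, period_p)
print(order)
```

A TF that ranks first in one test keeps a small product even when its other rank is poor, which is how the index favours TFs that are significant in either test.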

To determine how many TFs should be included, we
examined the coverage rates of TFs. The coverage rate is
evaluated at two levels: the fraction of genes in the PCCGs
and KCCGs which are targets of the selected TFs, and
the fraction of gene pairs which are co-regulated by any
one of the selected TFs. In the PCCGs and KCCGs, 232/236
genes are involved in the chip-chip dataset (at least
one TF can bind to them), and 77 gene pairs, which are
both genetically interacting and co-expressed, can be
simultaneously bound by the same TF. We noticed that
when the top 25 TFs are selected, most of the 232 genes
and 77 gene pairs (75% and 97%, respectively) could be
covered (Figure S2 in Additional file 5). The coverage rate
increases quite slowly when more TFs are selected. We
therefore used these 25 TFs to construct the transcriptional
network based on the PCCGs and KCCGs.
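The two coverage rates can be sketched as follows, assuming the binding data are available as a mapping from each TF to its set of target genes. The function name `coverage`, the TF ranking, and the toy targets are hypothetical and serve only to show how a coverage curve over the top-k TFs could be computed.

```python
def coverage(selected_tfs, targets_of, genes, gene_pairs):
    """Return (gene coverage, pair coverage) for a list of selected TFs.

    Gene coverage: fraction of genes bound by at least one selected TF.
    Pair coverage: fraction of gene pairs bound together by a single selected TF.
    """
    covered = set()
    for tf in selected_tfs:
        covered |= targets_of.get(tf, set())
    gene_cov = len(covered & set(genes)) / len(genes)
    pair_hits = sum(
        any(a in targets_of.get(tf, set()) and b in targets_of.get(tf, set())
            for tf in selected_tfs)
        for a, b in gene_pairs
    )
    return gene_cov, pair_hits / len(gene_pairs)

# Hypothetical binding data and prioritized TF list
targets_of = {"TF1": {"g1", "g2"}, "TF2": {"g3"}, "TF3": {"g2", "g4"}}
genes = ["g1", "g2", "g3", "g4"]
gene_pairs = [("g1", "g2"), ("g2", "g4"), ("g1", "g3")]
ranked_tfs = ["TF1", "TF2", "TF3"]

# Coverage as more top-ranked TFs are included
curve = [coverage(ranked_tfs[:k], targets_of, genes, gene_pairs)
         for k in range(1, len(ranked_tfs) + 1)]
print(curve)
```

Plotting such a curve against k shows where adding further TFs stops improving coverage appreciably.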

Enrichment for cell-cycle genes and TFs 

We next determined whether the PCCGs are enriched
with known cell-cycle genes. Among the PCCGs, we calculated
the proportion of genes which are annotated to
participate in the cell cycle process (in the MIPS database)
and used the hypergeometric distribution to define the
p-values. About 1/2 of the PCCGs (94/206) were determined
to be known cell-cycle and DNA processing
genes (p = 6 x 10"''). We performed the same test on
the selected TFs. Eighteen of them are known to be cell
cycle TFs (p = 5 X 10"", Table S4 in Additional file 6).

Enrichment for CDC28 substrates 

Since cell cycle events are controlled by cyclin-dependent
kinases (CDKs), we investigated whether Cdk1
(CDC28) substrates were enriched in our PCCGs and
selected TFs. As expected, both of them turned out to
be enriched with CDC28 substrates (Table S5 in Additional
file 7), further supporting the finding that both
PCCGs and selected TFs are cell cycle-related.

Formation of a cooperative transcriptional network by 
selected TFs is supported by indirect evidence 

We compared the use of all TFs in the database with the
use of only the selected 25 TFs to explain indirect
transcriptional relationships between the 25 TFs and 232
target genes. By comparing the wild-type and TF
mutant microarray data, we could tell how one TF
affects the expression of each gene. These data reflect the
transcriptional relationship between the TFs and targets,
although the relationship could be indirect. This independent
evidence, which describes the transcriptional network,
can be utilized to validate the network we constructed.

Between our 25 TFs and the 232 target genes, there
are 140 indirect TF-target pairs. By using the transcriptional
relationships of all 183 TFs in the chip-chip dataset,
103/140 pairs could be connected within three
steps, although with more steps, quite a few more indirect pairs
could be explained (Figure S3 in Additional file 8). We
also tested the fraction of indirect TF-target pairs which
could be connected by using only the relationships of
the 25 TFs. The result (Figure S3) shows that the sub
TF-target network can explain 85.4% (88/103) of the
indirect relations in the first three steps. This result
illustrates that the 25 TFs form a cooperative transcriptional
network which can explain its indirect connections
quite well.
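One way to read "connected within three steps" is as a bounded path search over the TF binding network. The sketch below follows that reading, which is an assumption rather than the authors' stated procedure; the function names `steps_to_target`, `restrict` and `explained_fraction`, and the toy network, are hypothetical.

```python
from collections import deque

def steps_to_target(binding, source, target, max_steps=3):
    """Shortest number of binding edges from source to target,
    or None if no path of at most max_steps edges exists."""
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        if dist == max_steps:
            continue  # do not extend paths beyond the step limit
        for nxt in binding.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def restrict(binding, tfs):
    """Keep only the outgoing edges of the selected TFs."""
    return {tf: targets for tf, targets in binding.items() if tf in tfs}

def explained_fraction(binding, pairs, max_steps=3):
    """Fraction of indirect TF-target pairs connected within max_steps."""
    hits = sum(steps_to_target(binding, tf, gene, max_steps) is not None
               for tf, gene in pairs)
    return hits / len(pairs)

# Hypothetical binding network (TF -> bound genes, some of which are TFs)
binding = {"TF1": ["TF2", "g1"], "TF2": ["TF3"], "TF3": ["g2"], "TF4": ["g3"]}
indirect_pairs = [("TF1", "g2"), ("TF1", "g3")]
full = explained_fraction(binding, indirect_pairs)
sub = explained_fraction(restrict(binding, {"TF1", "TF3"}), indirect_pairs)
print(full, sub)
```

Comparing the fraction obtained with the full network to the fraction obtained with the restricted network mirrors the comparison between all TFs and the selected subset.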

Clustering of the constructed transcriptional network 

To understand the structure of our constructed transcriptional
network, we used the transcriptional profile
to perform cluster analysis (Figure S4 in Additional file 9).
The presence of several established cooperating TF
clusters, such as Ace2/Swi5, Fkh1/Fkh2/Mcm1 and
Swi4/Swi6/Mbp1, indicates that TFs with similar function
regulate similar targets in the network. In addition,
genes with similar function also cluster
together, and these clusters have a meaningful explanation
in the context of their TFs. Figure 4 shows three such
functional clusters.
The first one contains G1/S and S phase functional
genes which are simultaneously regulated by the important
G1/S transcriptional regulators SBF (Swi4/Swi6) and
MBF (Mbp1/Swi6). The second contains DNA
replication-related genes which are mainly regulated by
Mbp1. The third contains chromosome segregation- and
budding-related genes which are mainly regulated by
Swi4. These results are consistent with our knowledge
about the function of SBF and MBF in the cell-cycle
process. Hence, the structure of our transcriptional network
revealed by the cluster analysis could also be used
to infer functional relationships between genes.

Potential cell cycle-related genes 

Based on E-MAP, expression, chip-chip, and protein
phosphorylation data and the analysis above, we could
identify PCCGs and describe their positions in the
transcriptional network (results are summarized in Table S6
in Additional file 10). Among the PCCGs, we
introduce four genes with unknown function (Figure 5).

The first one is YPL158C, which genetically interacts with
PCL9, AMN1 and BUD4. These four genes are all regulated
by known TFs in M phase (including G2/M and
M/G1). The expression data show that YPL158C, PCL9
and AMN1 are simultaneously expressed and that their
peak expression is later than that of ACE2 and
SWI5, which are their transcriptional regulators, as well
as that of BUD4 (Figure 6A). This is consistent with the
regulatory network because BUD4, ACE2, and SWI5 are
mainly regulated by FKH1/2 and MCM1, while
YPL158C is mainly regulated by ACE2 and SWI5. They
all act in M phase or early G1 phase. Based on these
observations, YPL158C is possibly involved in M phase
and co-operates with PCL9, AMN1 and BUD4.

The second is YPR174C, which genetically interacts with CLN3
and potential substrates of CDC28. YPR174C and CLN3
are co-regulated by MBP1 and XBP1, another known
cell cycle TF, which was not selected in the procedure
above because of its low periodic rank (rank in PT: 120;
rank in ET: 11; final rank: 34). According to the description
in SGD, XBP1 is a member of the SWI4/MBP1
family. Since MBP1 and XBP1 do not have significant
periodic expression, we compared the expression of
SWI4 with CLN3 and YPR174C. We found that SWI4
and YPR174C are significantly co-expressed and that
CLN3 and YPR174C also show a co-expression pattern,
but with a lag of two time points (Figure 6B). It is therefore
plausible that YPR174C is involved in the
G1 phase of the cell cycle process, since all other related
genes mainly act in this phase. In addition, based
on the transcriptional analysis above, YPR174C is mainly
regulated by MBP1; hence, it is possibly involved in the
DNA replication process.

The third one is YJR054W,
which genetically interacts with BUD4 and potential
substrates of CDC28. In the Chip-Chip data, BUD4 is regulated
by MCM1. YJR054W is regulated by SWI4 and
SWI6, which are also regulated by MCM1. In the expression
data (Figure 6C), we found that the expression profiles of
SWI4 and BUD4 are highly negatively correlated. This
can be explained by the fact that MCM1 participates in
the formation of both repressor and activator complexes,
and SWI4 and BUD4 may be regulated by different
complexes. The expression of YJR054W is similar to
that of SWI4 with a slight lag, which supports the regulation
between them. Since BUD4 can influence the next
round of budding and SWI4/6 mainly regulates the G1
phase, YJR054W may be involved in M/G1 phase and
may co-operate with BUD4.

The last one is YPR045C,
which genetically interacts with CLN3, albeit negatively,
and with potential substrates of CDC28. It is regulated by
HCM1 and ABF1, which both regulate S phase during
the cell cycle process. The expression of YPR045C is
negatively correlated with that of HCM1 (Figure 6D), which
may suggest that HCM1 suppresses the expression of
YPR045C. Considering that
CLN3 is a G1 cyclin and activates the Cdc28 kinase to promote
the G1 to S phase transition, we suggest that
YPR045C could play a role during G1 and S phase.
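The lagged co-expression patterns described above, such as two profiles that match after a shift of two time points, can be quantified with a time-lagged correlation. The following sketch is one simple reading of that idea (maximum Pearson correlation over a small range of shifts, then the maximum across experiments) and is not necessarily the exact definition used in the cited method; the function names and profiles are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def time_lagged_correlation(x, y, max_lag=2):
    """Best correlation over shifts of y relative to x, and the lag achieving it.
    A positive lag means y follows x by that many time points."""
    best, best_lag = -1.0, 0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[: len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[: len(y) + lag]
        if len(a) < 3:
            continue  # too few overlapping points to correlate
        r = pearson(a, b)
        if r > best:
            best, best_lag = r, lag
    return best, best_lag

def coexpression_score(profiles_x, profiles_y, max_lag=2):
    """Maximum time-lagged correlation across several experiments."""
    return max(time_lagged_correlation(x, y, max_lag)[0]
               for x, y in zip(profiles_x, profiles_y))

# Hypothetical profiles: y repeats x with a delay of two time points
x = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3]
y = [2, 1, 0, 1, 2, 3, 2, 1, 0, 1]
best_r, best_lag = time_lagged_correlation(x, y)
print(best_r, best_lag)
```

Taking the maximum across experiments corresponds to the loose definition of co-expression in which a high correlation in a single experiment is sufficient.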

Discussion 

Our approach integrates the genetic interaction network,
co-expression network, and transcriptional network, and
it performed well in predicting cell cycle genes. Many
previous papers have also discussed integrating these
data sources in eQTL analysis. However, in contrast to
these approaches, we started from known functional
genes and their E-MAP profiles to build up the network
step by step. In this process, we could see how these
data sources describe the biological network and
how they work together. We illustrate that E-MAP
and DNA microarray data can be complementary in
identifying PCCGs, and the clustering result shows
how transcriptional relationships can reflect functional
connections of genes in the network.

In addition, there are other types of networks, such as
protein physical interaction networks, which are informative
for the prediction of gene function. However,
because physical interactions annotated in databases are
quite sparse between KCCGs and the 1536 library
strains, we did not analyze them. We
believe the efficiency of prediction can be increased
when such data are integrated in a reasonable
framework.

Although the current study focused on the cell cycle
process, our approach is not limited to it and can easily be
applied to other biological processes, given the availability
of data.

Conclusions 

E-MAP technology is a powerful high-throughput tool
to identify novel functional genes which genetically
interact with known ones. By screening forty-eight cell
cycle genes against 1536 library strains, E-MAP helped
us obtain a large potential cell cycle-related gene set.

Our analysis shows that genetic interaction and gene
co-expression are complementary for identifying
co-functional gene pairs, and that combining them
significantly improves the accuracy of the prediction.

TF-DNA binding (chip-chip) and protein phosphorylation
data were used to construct a cell cycle regulation
network. Periodic expression and enrichment of cell
cycle genes among targets can both be used to identify TFs
which regulate the cell cycle process. Comparison of
the clustering result with prior knowledge suggests
that our cell cycle transcriptional network is well
constructed. This network could help to illustrate how
PCCGs are involved in the cell cycle process.

Finally, four genes with unknown functions in the PCCGs
were examined in detail. From the KCCGs with which the
four genes genetically interact and are co-expressed, we
could predict which phase of the cell cycle they may be
involved in. In addition, their time course expression data are
consistent with the constructed transcriptional network,
and some of them are substrates of the CDK1 (CDC28)
kinase, which regulates the cell cycle process in budding
yeast. All these analyses provide strong evidence that
the four novel genes participate in cell cycle-related
processes.

Methods and materials 

E-MAP experiment data 

The 48 cell cycle genes, which function in different
phases of the cell cycle process, were manually curated
from the literature (Figure S5 in Additional file 11). They
were screened by crossing with a 1536 mutant strain library
in budding yeast, and the corresponding double mutant strains
were selected to obtain the E-MAP data. However, the
analytical framework we developed does not depend on the
selection of these genes, and it can be applied to other
processes as well.



Time course expression data and definition of correlation

We use eight time course microarray experiments from
four previously published works to perform the
co-expression analysis [14-17]. The data were downloaded
from the supplementary data on the authors' websites,
and the KNNImpute method [18] was used to impute
missing values.

To measure the similarity between the time course
expression profiles of two genes, we used the time-lagged
correlation [19]. For multiple experiments, we
adopted a loose definition of correlation between two
genes, namely the maximum time-lagged correlation score
over all eight experiments. This means that two genes
showing high correlation in one experiment are considered
to be co-expressed. We can use such a loose definition
because we already apply a stringent constraint
in the E-MAP analysis to ensure that the interactions are
reliable, even if two genes only show co-expression in
one experiment.

Comparison with SGA genetic interaction data

The true genetic interactions we used here are previously
published interactions from BioGRID (http://www.thebiogrid.org).
Similar to previous work, for negative
genetic interactions, we considered interactions
annotated as phenotypic enhancement, synthetic growth
defect and synthetic lethality. For positive genetic
interactions, interactions annotated as phenotypic
suppression and synthetic rescue are used. By using S-score
cut-offs, we calculated the number of true positives (TP)
as the number of BioGRID interactions with S-scores
more extreme than the cut-offs. As defined in previously
published work, sensitivity is the fraction of
known interactions that are recovered:

$$\text{sensitivity} = \frac{TP}{TP + FN}$$

where FN is the number of known interactions that do not
pass the cut-offs. Precision is defined as the fraction of
true interactions in the set of all predicted pairs:

$$\text{precision} = \frac{TP}{TP + FP}$$

where FP is the number of predicted pairs that are not
known interactions. We also use the hypergeometric
distribution to calculate the p-value of precision. The
corresponding results are reported in Table S2 in
Additional file 2.

Definition of p-values of enrichment analysis for
biological process terms in GO

We also use the hypergeometric distribution to calculate a
p-value to measure the enrichment of one biological
process in GO annotation as follows:

$$p_i = p(m_i, n, M_i, N) = \sum_{k \ge m_i} \frac{\binom{M_i}{k}\binom{N - M_i}{n - k}}{\binom{N}{n}}$$

Here $m_i$ is the number of selected genes which have
the function i; n is the number of selected genes; $M_i$ is
the number of test genes which have the function i; N is
the number of test genes.
Finally, we use Bonferroni correction to account for
multiple testing and get
the q value.
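The hypergeometric tail probability and the Bonferroni adjustment can be computed with the Python standard library alone. The sketch below is illustrative: the function names `hypergeom_tail` and `bonferroni` are hypothetical, and the argument names follow the symbols m, n, M and N used in the definition.

```python
from math import comb

def hypergeom_tail(m, n, M, N):
    """P(X >= m) when n genes are drawn without replacement from N test genes,
    M of which carry the annotation of interest."""
    upper = min(n, M)
    total = comb(N, n)
    # comb returns 0 when n - k exceeds N - M, so impossible terms vanish
    favourable = sum(comb(M, k) * comb(N - M, n - k) for k in range(m, upper + 1))
    return favourable / total

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]

# Small worked case: draw 2 of 4 genes, 2 of which are annotated
p = hypergeom_tail(1, 2, 2, 4)   # probability of at least one annotated gene
adjusted = bonferroni([0.01, 0.04, 0.5])
print(p, adjusted)
```

For the worked case, the two favourable outcomes (exactly one or exactly two annotated genes) account for 5 of the 6 possible draws.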

Transcriptional Regulation and CDC28 substrate datasets 

The Chip-Chip data and wild type vs. TF mutant microarray
data were downloaded from YEASTRACT
(http://www.yeastract.com [20,21]). Among the 183 TFs in
our dataset, 37 are annotated as cell cycle-related in the MIPS
database. The CDC28 substrate dataset was downloaded
from the supplementary data of two previously published
works [22,23].

Definition of p-values for periodicity and enrichment for 
cell cycle genes 

The significance of periodicity was previously defined
[24,25]. The data were downloaded from http://www.cyclebase.org.
We also used the hypergeometric distribution
to calculate the p-value for the enrichment of
cell cycle targets:

$$p_i = p(m_i, n, M_i, N) = \sum_{k \ge m_i} \frac{\binom{M_i}{k}\binom{N - M_i}{n - k}}{\binom{N}{n}}$$

Here, $m_i$ is the number of neighbors of $TF_i$ in the PCCGs
and KCCGs; n is the number of genes in the PCCGs and
KCCGs; $M_i$ is the number of targets of $TF_i$ in the test
genes; and N is the number of test genes.

Multiplication of ranks can represent "or" relationship 
between the two methods 

Suppose the probabilities that $TF_i$ is not enriched
and not periodically expressed are $p_i^e$ and $p_i^p$,
respectively. Then the probability that $TF_i$ is either
enriched or periodically expressed is
$Pr_i = 1 - p_i^e p_i^p$. For $TF_i$, the product of its ranks
of p-values (ascending order), $r_i^e r_i^p$, approximately
keeps the order of the product of probabilities $p_i^e p_i^p$.
Thus, the smaller $r_i^e r_i^p$ is, the larger $Pr_i$ of
$TF_i$ is.

Additional material 



Additional file 1: 48 cell cycle query genes This file can be viewed
with Microsoft Excel Viewer.

Additional file 2: Sensitivity and precision of E-MAP genetic 
interaction scores This file can be viewed with Microsoft Excel Viewer. 

Additional file 3: Correlation between our E-MAP data and 
published data This file can be viewed with Adobe Reader. 

Additional file 4: Ranks of p-values for top 25 TFs This file can be 
viewed with Microsoft Excel Viewer. 

Additional file 5: Cover rate when different TFs are selected This file 
can be viewed with Adobe Reader. 

Additional file 6: Cell cycle genes are enriched in the PCCGs and
TFs This file can be viewed with Microsoft Excel Viewer.

Additional file 7: CDC28 substrates are enriched in the PCCGs and 
TFs 



Additional file 8: Indirect TF-Target connection analysis This file can 
be viewed with Adobe Reader. 

Additional file 9: Cluster results of transcriptional network This file 
can be viewed with Adobe Reader. This file can be viewed with 
Microsoft Excel Viewer. 

Additional file 10: Summary of our analytical results This file can be 
viewed with Microsoft Excel Viewer. 

Additional file 11: Composition of the signaling E-MAP This file can 
be viewed with Adobe Reader. 